Virus Evolution

26 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
GPAS: an online AI system for rapid and accurate pathogen identification and LLM-based interpretation
2026-02-20 public and global health 10.64898/2026.02.18.26346517
Top 0.2% (2.0%)

Accurate identification of unknown pathogens is critical for medicine and public health, yet current metagenomic workflows remain heavily dependent on specialized bioinformatics expertise and manual interpretation, creating substantial bottlenecks in time-sensitive diagnostic settings [1]. The key challenges lie in achieving precise species identification amidst high background noise and translating complex microbial data into clinically actionable insights [2,3]. Here we present the Global Pathogen A...

2
Herpes simplex virus genomes from an under-sampled population in Namibia reveal novel genetic diversity
2026-02-19 epidemiology 10.64898/2026.02.18.26346525
Top 0.3% (1.9%)

Herpes simplex virus (HSV) is an endemic pathogen, infecting most adults world-wide. HSV infection can cause a wide spectrum of disease outcomes, ranging from asymptomatic infection or mild lesions to rare cases of infectious keratitis, encephalitis, and death. HSV genome sequences have been shown to differ between individual patients, as well as within individuals. To date, the vast majority of publicly available HSV genomic data has come from Europe and North America. Our current understanding...

3
Genomic, antigenic and transmission dynamics of influenza A(H3N2) subclade K in Canada, early 2025/26 season
2026-02-12 infectious diseases 10.64898/2026.02.10.26345998
Top 0.4% (1.6%)

Influenza A(H3N2) subclade K virus was detected in Canada early in the 2025/26 influenza season, bearing an antigenic transition in the hemagglutinin (HA) glycoprotein. Analysis of 396 HA sequences from Canada showed antigenic divergence from 2025/26 influenza vaccine strains, consistent with partial mismatch. Phylodynamic analysis revealed sustained pre-vaccine transmission without clear post-vaccine expansion. Phylogenetic and phylogeographic analyses indicated interprovincial mixing within a ...

4
Genomic surveillance of Lassa virus in Guinea through in-country sequencing
2026-03-05 infectious diseases 10.64898/2026.03.04.26347418
Top 0.5% (1.5%)

Strengthening in-country sequencing capacity generated 28 Lassa virus genomes from human clinical cases, expanding our knowledge of Lassa fever in Guinea. Phylogeographic analysis revealed cross-border exchange between Liberia and the NZerekore region, and a Sierra Leone introduction into the Gueckedou area. Enhanced genomic surveillance is crucial to guide future public health actions.

5
Metagenomic strain tracking reveals patterns of bacterial spread and the impact of water chlorination
2026-02-11 infectious diseases 10.64898/2026.02.08.26345864
Top 0.7% (1.2%)

Bacterial infections are a major cause of morbidity and mortality among children under five in low- and middle-income countries (LMICs). Children in LMICs are exposed to and colonized by a range of pathogenic bacteria, yet patterns of bacterial exchange between humans are not well known, in part because culturing and sequencing single bacterial isolates is labor-intensive. Here, we apply a machine learning strain tracking approach to metagenomic data from 511 stool samples from children and moth...

6
Outburst of serotype 4 IPD after COVID-19 is driven by ST15063/GPSC162 lineage associated with high-risk behaviors and greater virulence linked to influenza H3N2 virus coinfection and cigarette smoke
2026-03-04 infectious diseases 10.64898/2026.02.27.26346872
Top 0.8% (1.0%)

The emergence of vaccine-covered serotypes causing invasive pneumococcal disease (IPD) is a serious concern worldwide. We investigated the unexpected rise of serotype 4 causing IPD primarily in non-vaccinated young adults after the COVID-19 pandemic that further spread to adults ≥65 years in recent years. For this purpose, we conducted a retrospective study of serotype 4 IPD cases (n=827) reported in Spain between 2009 and 2024. Whole-genome sequencing was performed to assess clonal lineag...

7
Inferring Respiratory Disease Biology from Geolocation Data
2026-03-05 infectious diseases 10.64898/2026.03.05.26347578
Top 0.8% (1.0%)

Biological fitness quantifies the efficiency and selective advantage of pathogens and hosts in their bilateral interaction. Key questions--such as how much more infectious an emerging variant is compared with its predecessor, or how much protection vaccination offers relative to no vaccination--require fitness to be measured systematically, in real time, and ideally beyond controlled laboratory settings. We propose an approach that infers biological fitness from mostly non-biological data on inf...

8
Benchmarking HLA genotyping from whole-genome sequencing across multiple sequencing technologies
2026-02-12 health informatics 10.64898/2026.02.10.26345621
Top 0.8% (1.0%)

Background: The hyperpolymorphic nature and structural complexity of the human leukocyte antigen (HLA) genomic region present challenges for accurate and scalable typing across diverse sample types. While whole-genome sequencing (WGS) offers the opportunity to infer HLA genotypes without targeted enrichment, systematic benchmarks across sequencing platforms, biospecimens and coverage levels remain limited. Results: We assembled a multi-platform resource of WGS datasets derived from short-read (Illum...

9
Seasonal vaccine-induced immunity shows preserved cross-reactivity to H3N2 subclade K in adults
2026-02-18 infectious diseases 10.64898/2026.02.18.26346502
Top 1% (0.8%)

Influenza A subclade K viruses caused high infection rates in the 2025/2026 Northern Hemisphere season, raising concerns about antigenic drift and reduced vaccine effectiveness. We measured antibody responses in matched human pre- and post-vaccination sera against a vaccine-like as well as subclade K isolates. Pre-existing immunity to subclade K variants was noted with seasonal influenza vaccination boosting titers two-fold against subclade K and three-fold against the va...

10
No evidence for a classic transmission-duration tradeoff in human malaria infections
2026-02-09 infectious diseases 10.64898/2026.02.01.26345288
Top 1% (0.7%)

Pathogenic organisms are typically thought to be constrained by a tradeoff between the rate and duration of transmission, an assumption that underpins a considerable body of evolutionary theory. Here we test for a transmission-duration tradeoff using detailed historical malaria infection data from an era prior to widespread use of antibiotics when humans were deliberately infected with malaria parasites as treatment for neurosyphilis (malariatherapy). These time series follow individual human in...

11
Mapping the specificity of H3N2 strain-specific and cross-reactive human neutralizing antibodies elicited by the 2025-2026 influenza vaccine
2026-02-22 infectious diseases 10.64898/2026.02.20.26346746
Top 1% (0.7%)

An H3N2 variant, named subclade K, continues to circulate widely during the 2025-2026 influenza season. This virus possesses a hemagglutinin (HA) protein that has eleven substitutions relative to the HA of the Northern Hemisphere 2025-2026 H3N2 vaccine strain. Many of these substitutions are in epitopes in well-characterized HA antigenic sites. Despite this, interim vaccine effectiveness studies indicate that the 2025-2026 influenza vaccine provides moderate protection against H3N2 subclade K in...

12
Temporal trends in Plasmodium vivax diversity in eastern Cambodia evidence declining transmission
2026-03-04 infectious diseases 10.64898/2026.03.03.26346840
Top 1% (0.7%)

Background: Elimination of Plasmodium vivax is challenging due to its dormant liver stages (hypnozoites), which can reactivate weeks or months after the primary infection, causing relapses and ongoing transmission of the parasite. Despite these challenges, P. vivax clinical case numbers have declined over the past decade in Cambodia. We used parasite genotyping to assess whether the decline in case numbers was reflected in parasite diversity and relatedness as a proxy to transmission. Methods: Geno...

13
The Representativeness of Regional Influenza Virus Genomic Surveillance for National Trends in the United States
2026-03-02 infectious diseases 10.64898/2026.02.23.26346422
Top 1% (0.7%)

Genomic surveillance of influenza viruses informs vaccine strain selection and evolutionary forecasting. Sequencing efforts vary widely across U.S. states, which raises concerns about spatial sampling bias. We evaluated how well 10,958 influenza virus genomes sampled by our group in Michigan captured the genetic diversity in 34,743 genomes circulating nationally from the 2021/22 through 2024/25 seasons. We defined seasonal hemagglutinin haplotypes and tracked their detection across states. A sma...

14
Novel transposon Tn8026 acts as a global driver of transmissible linezolid resistance in Enterococcus via a linear plasmid
2026-03-04 infectious diseases 10.64898/2026.03.04.26347163
Top 2% (0.6%)

Linezolid is a critical last-resort antimicrobial for multidrug-resistant Enterococcus faecium, particularly against vancomycin-resistant lineages where therapeutic options are severely limited. While resistance has historically arisen through de novo chromosomal mutations, the global emergence of transferable resistance mechanisms threatens to render more infections untreatable. Here, we characterise a recent (2023-2024) hospital-associated outbreak of linezolid-resistant E. faecium in Queensla...

15
Population immunity to clade 2.3.4.4b H5N1 is dominated by anti-neuraminidase antibodies
2026-02-12 infectious diseases 10.64898/2026.02.10.26346014
Top 2% (0.5%)

Clade 2.3.4.4b highly pathogenic avian influenza A(H5N1) viruses continue to expand geographically and across mammalian hosts, raising concern about pandemic potential. The degree and specificity of pre-existing immunity in humans are key determinants of this risk. We analyzed hemagglutinin (HA)- and neuraminidase (NA)-specific antibody responses in 300 sera collected from adults in New York City. While HA-directed binding antibodies to clade 2.3.4.4b H5 were low and hemagglutination-inhibiting a...

16
Interplay of Immunity, Climate, and Viral Evolution Explains Semiannual SARS-CoV-2 Dynamics with Implications for Control
2026-03-02 epidemiology 10.64898/2026.02.27.26347213
Top 2% (0.5%)

In the three years since Omicron emergence, SARS-CoV-2 dynamics have exhibited persistent twice-yearly waves in the United States, peaking in late summer and winter, with heterogeneity in timing and intensity across states. This semiannual pattern sharply contrasts with typical annual respiratory pathogen dynamics in the US, yet its underlying mechanisms and whether this pattern will persist remain poorly understood. Here, we tested several hypothesized mechanisms and found that a combination ...

17
Assessing Resistome Host Range Across Water Reclamation in Three Geographically Distinct Communities using Hi-C Sequencing
2026-02-16 epidemiology 10.64898/2026.02.12.26346186
Top 2% (0.5%)

Antimicrobial resistance (AMR) is a growing problem, with annual deaths set to pass 10 million by 2050 if current trends continue. Wastewater surveillance has been proposed as a strategy to understand population-level resistance, and water reclamation facilities (WRFs) have been identified as a control point for environmental dissemination of resistant bacteria. Understanding dynamics of AMR across WRFs requires advanced molecular tools that elucidate host bacteria, especially for mobile resista...

18
Mapping spatial colleague connectivity patterns from individual-level registry data to inform regional pandemic interventions
2026-02-20 infectious diseases 10.64898/2026.02.19.26346499
Top 2% (0.5%)

A concern in infectious disease modelling is how accurately population mixing is incorporated, as it shapes the type and frequency of contacts through which infection spreads, and consequently, estimated intervention effectiveness. Although synthesizing mixing patterns from diary-based surveys is an established framework, geographical information is poorly or sparsely captured. Here we propose a generalizable workflow to quantify geographical connectivity from job registry data covering over 8 m...

19
Self-reported health history from 70,724 individuals reveals novel HLA associations with allergy and other frequently underreported conditions
2026-02-19 genetic and genomic medicine 10.64898/2026.02.18.26346586
Top 2% (0.5%)

Background: Variation in the HLA loci, located on human chromosome 6p, has been associated with hundreds of diseases and conditions. However, high levels of polymorphism that characterize the HLA system, coupled with generally modest effect sizes for most phenotypes, necessitate relatively large sample sizes to power association studies; meanwhile, high resolution HLA genotyping remains relatively resource intensive. These constraints limit identification of novel associations. While phenome-wide ...

20
Gut microbiome and metabolome reveal hormone-related and functional alterations in ER-positive breast cancer: a case-control study
2026-02-09 epidemiology 10.64898/2026.02.06.26345778
Top 2% (0.4%)

The gut microbiome has been linked to breast cancer, largely through microbial functions involved in estrogen metabolism (the "estrobolome"); however, specific microbial targets remain poorly defined in human studies. Here, we profiled the gut microbiome using whole-metagenome shotgun sequencing, and plasma and stool metabolites were quantified using targeted metabolomics, in a study of 70 postmenopausal female cases with treatment-naive ER-positive breast cancer and 70 controls. Reduced species...